docs: add first training run runbook with pre-flight checklist #99

Merged
abrichr merged 1 commit into main from docs/training-runbook on Mar 4, 2026

@abrichr commented on Mar 4, 2026

Summary

  • Add a training runbook for running GRPO/GiGPO training loops on WAA
  • Include a pre-flight checklist for verifying the Azure WAA VM and the AWS GPU VM
  • Cover instance selection, launch commands, monitoring, the iteration plan, common failure modes, and success criteria

Test plan

  • Verify markdown renders correctly on GitHub
  • Confirm all checklist items are actionable
  • Validate launch commands against current verl-agent/VAGEN config schema

Generated with Claude Code

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@abrichr merged commit dd6b6fc into main on Mar 4, 2026
1 check passed